Emergent digital bio-computation through spatial diffusion and engineered bacteria

Biological computing is a promising field with potential applications in biosafety, environmental monitoring, and personalized medicine. Here we present work on the design of bacterial computers using spatial patterning to process information in the form of diffusible morphogen-like signals. We demonstrate, mathematically and experimentally, that single, modular, colonies can perform simple digital logic, and that complex functions can be built by combining multiple colonies, removing the need for further genetic engineering. We extend our experimental system to incorporate sender colonies as morphogen sources, demonstrating how one might integrate different biochemical inputs. Our approach will open up ways to perform biological computation, with applications in bioengineering, biomaterials and biosensing. Ultimately, these computational bacterial communities will help us explore information processing in natural biological systems.

be the lowest concentration as we cannot reduce the concentration by adding an input.Similarly, the 11 state will always be the highest concentration seen by the receiver.However, the ordering of the 01 and 10 states will depend on the distance between the receiver and each of the inputs; when the are equidistant, 01 and 10 produce the same concentration seen by the receiver.However, if a receiver is one space away from input A and two spaces away from B, 10 will have a higher concentration than 01 and the order will be as above.If instead the receiver is next to B rather than A, the concentration seen by the receiver when B is on is higher than when A is on, and so the order will switch to 00, 10, 01, 11.The relative concentrations of the input states, and therefore their order, can change depending on the relative position of the receiver and the inputs (Figure 3A).
There are some constraints on the relative concentrations of signalling molecules which must hold for all possible receiver positions.For two inputs these are: where I ab is the concentration of signalling molecule at the receiver for input state ab.This imposes constraints on the allowable orderings of the input states.

Activation functions partition the state space
With input states placed on a dimension of increasing signal concentration, we can conceptualise the activation functions engineered into the receiver cells as partitioning the state space into ON and OFF regions.For example, the highpass function remains in an OFF state at signal concentrations below a threshold, and switches to an ON state at signal concentrations above that threshold.Any input states which are below that threshold will be OFF and any above will be ON.The bandpass produces three partitioned regions rather than two; OFF at low signal concentrations, ON at medium concentrations, and OFF again at high concentrations.The lowpass and bandstop are inversion of the highpass and bandpass.
There are some constraints on how the state space can be partitioned by the activation functions.The 00 state must be OFF for the highpass and bandpass, otherwise these would be constitutively on and lowpass functions respectively.Similarly the 00 state must be on for the lowpass and bandstop, otherwise these would be constitutively off and highpass respectively.However, there is no comparable constraint for the 11 state.For example, the highpass can be OFF in the 11 state if the two signal sources are too far away to produce a signal over the concentration threshold.

All two-input logic gates can be realised with a single receiver
Once we know the allowable orders of the input states and how our activation functions partition that space, we can demonstrate how this paradigm can produce two-input logic gates.First we take the highpass function (OFF|ON).For each admissible position of the partition boundary we will enumerate all the digital functions that can be obtained by swapping the order of input states.There are five possible positions of the partition boundary we need to consider.|00, 10, 01, 11 is not possible for the reasons stated above.For the 00|01, 10, 11 position we can see that we obtain an OR logic gateoutput is on with either or both inputs are on.If we swap the order of 10 and 01 -00|10, 01, 11-we still produce an OR logic function as the same input states are above the ON threshold.For the next boundary position; 00, 01|10, 11 is distinct from 00, 10|01, 11, so we gain two logic gates -A and B. Similarly to the OR gate, 00, 01, 10|11 and 00, 10, 01|11 both produce and AND gate, and 00, 01, 10, 11| and 00, 10, 01, 11| both produce and OFF gate.This means that with the highpass we have a total of five logic gates: OR, A, B, AND, OFF (Supplementary Figure 1A).With the lowpass we have the same boundary positions, but with the inverse mapping (ON|OFF) and through similar arguments, or through inversion of the five highpass logic gates, we gain another five gates (NAND, NOT A, NOT B, NOR, ON) (Supplementary Figure 1B).This means that with the highpass and lowpass we can achieve 10 of the 16 two input logic gates.
In order to achieve all sixteen two-input logic gates, we require the bandpass and bandstop as they are able to isolate states at intermediate concentrations.Without this capability, we could not perform IMPLY or XOR logic.Interestingly, we could perform XNOR and IMPLY logic using the methods described later for multiple receivers.You can also note that the bandpass and bandstop replicate all of the logic gates that can be encoded by the highpass and lowpass respectively.We retain the highpass and lowpass in our set of activation functions as they are easier to engineer and may prove more robust for certain functions.

Generation of valid orders of input states
For two-input logic, there are 4! = 24 ways of ordering the input state, but only two of those orderings meet our constraints: 00, 01, 10, 11 and 00, 01, 10, 11.For three-input logic these numbers explode; there are now eight input states rather than four (000, 001, 010, 011, 100, 101, 110, 111), leading 8! = 40320 possible orderings.We have similar constraints to the two-input case: where the set of states with one and two inputs ON will be labelled as I = {001, 010, 100} and II = {011, 101, 110} respectively.Furthermore, in the three input case there are additional higher order constraints, for example if group I is ordered such that, A 100 < A 010 < A 001 this imposes a constraint on group II, A 110 < A 101 < A 011

Algorithmic generation of valid orders
We developed a method to generate all the allowable orders of input states for a given number of inputs.Here an input state is represented by a string of letters, so that we don't have to type out long strings of 1s and 0s, where each letter represents an input that is turned on.All inputs included in the string are on and those not included in the string are turned off.For example ADE could be an input state for a five-input gate where input A,D and E are turned on and inputs B and C are turned off.The method generates all possible sequences of input states from lowest to highest signal concentration.The algorithm uses the existing information from input states previously assigned to a given sequence to generate the subsequent states.This ensures that no invalid sequences are generated.
The methods is split into two stages, the first stage is iterative and works as follows.At each step the function considers an incomplete sequence of input states and generates all valid options for the input state at the next position.The valid options for the next state are found by combining states already in the sequence and for each possible combination we record all indices of the input states which can produce that combination.This works as follows (Algorithm S1)

A B C D
Supplementary Figure 1: Activation function partitioning of input state space for two-input logic.There are two allowable orderings of the input states in order to maintain the constraint that 00 is lowest and 11 is highest.For each of these orders, we can partition the states into ON and OFF using one of the four activation functions.(A)Highpass can produce five of the sixteen two-input logic functions.(B) Lowpass produces the inverse of the highpass functions.(C) Bandpass and (D) bandstop allow us to produce all sixteen two-input logic functions.
so that the next position in the sequence is considered as the second state.For any combined state not already in the sequence we store the index of the first state and second state, if this combined state is found again by subsequent combinations we also store these indices.
2. We then increment the pointer to the first state by one and repeat until there are no more states to be combined and return the potential next states and the indices which were combined to produce them.
Algorithm 1 get next states The second part of the function is to apply constraints to the possible states generated in the previous algorithm (Algorithm S2).For some sequences the next possible states will be constrained by the existing sequence.For example, for the sequence 0, B, D, A, BD, C, AB, AD, two of the possible next states are BC and ABD.BC can be generated from combining the states at indices 1,5 (B and C) and ABD from states at 3, 4 (A and BD) and 2, 6 (D and AB).Now, just looking at BC from (1,5) and ABD at (3,4) we cannot exclude either state.However if we also consider the second pairing of (2,6) which also produces ABD we see that ABD cannot be the next in the sequence, because D > B and AB > C then ABD > BC and ABD cannot go next.

Algorithm 2 apply constraints
The final algorithm (Algorithm 3) combines both the previous algorithms.Starting from a list of the single input states, for a given number of inputs it will repeatedly apply Algorithms 1 and 2 to elements of a queue containing all possible orders until no more operations can be completed and all possible orders of states are generated.
Algorithm 3 generate valid orders

One receiver cannot realise all three-input digital functions
After the generation of orders, we can then apply the activation functions to get all the logic gates for 1 output.We apply every possible bandpass and bandstop function to the all generated sequences to find the logic functions which can be constructed with one receiver.There are 12 valid orders of input states for three-input logic.For the single partition activation functions there are 8 positions at which we can partition the state space.This results in 96 order-partition combinations, but only 19 of those result in unique logic (because, for example, 00, |01, 10, 11 is equivalent to 00, |10, 01, 11).For the bandpass and bandstop there are 29 ways to position the state space, resulting in 348 order-partition combinations, of which 76 result in unique logic.From this, it is clear that we cannot produce all 256 possible three-input logic functions if we rely on a single receiver colony to produce an output.Supplementary Table 2 contains the number of possible logic gates that can be generated with 1 to 5 inputs.
Supplementary Table 2 When the number of inputs increases to 5, the running time becomes too long and the memory consumption becomes too large, therefore the computation can only do up to 5 inputs 2 outputs case.
Based on the results in Table S3 we conjecture that the maximum number of receivers required for an n input function is n − 1, where the one-input case is a trivial corner case.However, we have yet to prove this.
The conjectured relative complexity of our approach was assessed by comparison with two other implementations of spatially distributed biocomputers [1,2].The first approach is based on the separation of cells into a set of connected growth chambers [1].The maximum number of independent growth chambers, named modules, was used for the comparison.A subsequent paper implemented similar circuits on branched 2D circuits on pieces of paper [2] and the maximum number of branches was used for comparison.Both of these quantities scale by 2 n−1 .It is shown in Supplementary Figure 2 that as the number of inputs increases the complexity of our approach scales favourably.This means that the distributed circuits considered here are capable of encoding complex digital functions with less biological complexity than the current state of the art.
Supplementary Figure 2: The conjecture result scales for larger number of inputs

The Macchiato algorithm: optimally distributed spatial circuits
The output of the Macchiato algorithm can then be used to calculate the required pattern of inputs and receivers using a calibrated finite difference model.Designing a spatial digital function, given a truth table, works as follows, where step 3 comprises the Macchiato algorithm itself (Figure 3C): 1.The truth table corresponding to the desired digital function is supplied as input to the Macchiato algorithm 2. The Macchiato algorithm: the following steps are repeated until all 1s in the output of the truth table have been assigned The output mapping is reordered to maximise a given block.The block to be maximised depends on the priority set by the user.For example, it is often favourable to maximise the number of input states that are mapped to ON by the highpass, in this situation the algorithm will first try to maximise the size of the upper rightmost block.
The maximised block is assigned to the don't care (DC) set, meaning that subsequent receivers can map them to either ON or OFF.
3. The blocks are assigned to receivers.The mapping of input states to output states by the receivers is the output of the Macchiato algorithm.
4. A spatial configuration of inputs and receivers that results in the truth table is determined using the calibrated finite difference model.If any of the green receivers are activated the output of the function is ON

Evolutionary algorithm for placement determination
The output of the Macchiato algorithm assigns a given logic gate to one or more receivers.An evolutionary algorithm is designed to find the optimal placement of receivers and inducers such that each receiver encodes its given function robustly.The Opentrons dispensing is arranged in a grid-like fashion, akin to a checkerboard, each grid cell can either contain an inducer or a receiver strain of bacteria, the latter of which may respond to one or more inducers.To determine the optimal placement of inducers and receivers, the algorithm simulates the diffusion of inducers and the response of the receivers over time.For each placement, the simulator computes a "score" based on the GFP produced by the receivers.The higher the score, the better the placement.The score is the radius of the largest circle that can fit within the logic area centered at an Opentrons position, where the logic area is the area in which the logic gate assigned to the given receiver is as predicted given the simulation and the user specified thresholds between ON and OFF.A larger score indicates a better match to the desired logic gate over a larger contiguous area.If there are multiple receivers the overall score is the minimum of the individual scores for the receivers.The evolutionary algorithm starts by generating a population of random placements.Each placement is then evaluated using the simulator, and a score is assigned to it.The placements are then ranked based on their scores, and a new generation of placements is created.The top 45% of placements (i.e., those with the highest scores) are selected for "reproduction".During reproduction, placements are randomly altered (or "mutated") to create new placements, which are then added to the next generation.Placements with scores of zero are considered non-viable and are replaced with entirely new random placements.The bottom 10% of placements, regardless of their score, are also replaced with new random placements to introduce more diversity into the population.This process of evaluation, selection, and reproduction continues for a predetermined number of generations.The hope is that, over time, the algorithm will evolve placements that yield higher scores.At the end of the process, the placement with the highest score is considered the optimal placement.The positions of the inducers and receiver colonies in this optimal placement, along with the associated synthetic genetic circuits and thresholds, are then saved for further analysis or experimental validation.

The relative importance of the activation functions for the Macchiato algorithm
The output of the Macchiato algorithm is an OR of the outputs of a set of colonies.This means it is important that input states that are required to be mapped to OFF are not mapped to ON by any of the output colonies.However, an input state that is required to be mapped to ON can be mapped to OFF by multiple output colonies, and as long as it is covered by another receiver this won't affect the functions output.The highpass and lowpass can only be used if a block of input states that need to be mapped to ON contain ALL 1 and or ALL 0 respectively i.e. the minimum and maximum signal concentration.As a non-zero amount of signal is required to activate the bandpass, the lowpass is required to map the state ALL 0 to ON, whereas the input state ALL 1 could be covered by the bandpass if required.The bandstop is only required where two blocks of input states contain ALL 0 and ALL 1 and is effectively a convenience that will simplify a number of functions.This means that the activation functions are ranked bandpass, lowpass, highpass, bandstop in order of most important to least important.This intuition was demonstrated and the capability of our approach to build the three-input logic gates was analysed.The Macchiato algorithm was applied to each three-input gate using four different sets of available activation functions.For each set of activation functions, the number of bacterial colonies required to build each gate was found (Table S4).With access to all four activation functions the Macchiato algorithm can achieve all three-input logic gates using one or two output colonies.If we remove the bandstop from the available activation functions we retain the capability to do all the three-input logic gates, however a larger proportion now require two output colonies as expected.If we remove the bandpass activation we are unable to construct the majority of the three-input logic gates, demonstrating the importance of the bandpass.This shows that experimental effort should be prioritised to developing first the bandpass and lowpass.Inferring the number of cells To calculate the number of cells to an arbitrary unit we use the Beer-Lambert law which states that the concentration of a sample is proportional to the absorbance of the sample.Assuming no scattering by the sample = log( where A is absorbance, T is transmittance, and ρ inc and ρ trans are the incident and transmitted light respectively.If we make the same simplifying assumption for transmitted light as we did for incident light i.e. that the measured pixel value is proportional to the light transmitted through the sample Inferring GFP concentration To calculate the concentration of GFP is more complex.Light from the blue LED panel is shone on the sample from underneath, some of that light is attenuated (as with the red light), some of the light excites the GFP in the sample and is emitted at a different wavelength.Any blue light that is transmitted through the sample is mostly, though not completely, blocked by the yellow filter.The green light that is fluoresced is also partially blocked by the yellow filter, though to a much lesser extent.We model the measured light as the sum of the above where the ks are coefficients giving the proportion of light of the given colour that can pass through the yellow filter, ρ trans is the light transmitted through the sample, and ρ em is the light emitted by the sample.Given our initial assumptions that N cells (t = 0) ≈ 0 and ρ inc ∝ pixel(t = 0) pixel blue (t = 0) ∝ ρ trans (t = 0) k blue (5) We make the simplifying assumption that the absorbance value is the same when using the blue LED as with the red LED.As such We use a model for fluorescence in which where ϕ is the detectable emitted light per unit fluorophore per unit incident light and N GFP is the number of fluorophores.Note here that our ϕ encompasses three constants that would normally be included in such a model: fluorescence efficiency, molar absorptivity, and the volume element of detection.Then, from the above equations Which gives us N GFP up to a constant 2.3 Mathematical modelling

The finite difference method
The finite difference method solves a system of differential equations by discretising the area of interest and, when using the forward Euler approximation, steps forward through time to simulate dynamics.The diffusion equation for the concentration of a diffusible molecule I is given by: ∂I(r, t) ∂t = D∇ 2 I(r, t) + S(r, t) where r is the position t is time, D is the diffusion coefficient, ∇ 2 is the Laplace operator and S(r, t) is a term that accounts for sources of the molecule.To solve this, the domain of interest is split into a finite number of discrete grid points and an approximate solution is calculated for each point on the grid.The finite difference for a forward time, central space approximation for the diffusion equation in two dimensions is: where I i,j,k is the concentration of diffusible molecule at spatial coordinates i, j and timestep k, ∆x and ∆y are the size of the discrete points in the x and y directions and ∆t is the timestep between successive iterations.This can be written as: where I k and S k are vectors of the current concentration and source terms at all the discretisation points at timestep k respectively and H is a matrix which operates on I to give the central space difference relations.Using this formula we can start from an initial concentration of diffusible molecule,I 0 , and iterate through time.The production of additional molecules by bacterial colonies can be included via the source term S k .This method can be used to model the production of and response to multiple diffusible communication molecules by bacterial colonies.

Simple finite difference model
A simple finite difference model was developed to verify whether the four activation functions could be used to represent digital logic gates where the inputs are represented by instantaneous sources of diffusible molecules.A square domain was discretised into 200 slices in both the a and y directions to give 40,000 square regions over which the diffusion equation was solved.To simulate the inputs instantaneous sources of diffusible molecule were applied to the initial conditions of the simulation.In this model all units of concentration, distance and diffusion speed were arbitrary.The four activation functions are represented by their digital approximations and assign each square region to OFF or ON as a function of the concentration of signalling molecule in that region.For each of the input states 00, 01, 10 and 11 a simulation was done and each activation functions was applied to the resultant concentration fields.The activation patterns of each activation function were combined to map the resulting logic gates onto each position on the simulation area.This effectively results in a map of the two-dimensional area around the inputs, where the colour of each region represents the logic gate that would be encoded by an receiver if it was placed within that region.

Calibrated finite difference model
To design logic gates we developed a more complex finite difference model which models cell growth and production of GFP using realistic activation functions.

Zong et al. steady-state model
In the paper, the system is modelled using Hill like functions with some adaptation.The output from each of the two P TAC promoters is described by where Input I is IPTG and each promoter has different parameters to take account of their differing context.The output from these promoters is T7 (Activator) and PhlF (Repressor).The output of the third promoter is GFP which is given by Simple dynamic model We will reuse most of the steady-state model but add a dilution term due to bacterial growth and division.We can even reuse most of the parameters with some simple modifications.Let's assume we have a simple model of gene expression in which x is expressed from an inducible promoter at a maximal rate k α which is modified by some function F (in our model a Hill function), there is some leaky expression at a rate k β , and the cells are growing and dividing at a rate µ.The expression for the change in x over time t is If we assume that transcription factor binding/unbinding occurs much faster than transcription and translation, we can say that F (I) is only dependent on input I and not on t.As such, at steady state So we can get the dynamic production rates by multiplying the already known α and β terms, from the Zong steady state model, by the growth rate µ.We should consider that this relationship is only defined at steady state but we will use it across our timecourse.Moreover, we will use the time varying growth rate as this will naturally give us the highest protein expression rates when growth is fastest and reduce expression as cells move into stationary phase.This leads to our dynamic model When R = 0 at all time points this models the highpass response, for R > 0 this models a bandpass response.
Parameterising growth The model presented above describes protein expression within single cells.The population size therefore affects protein expression only through its effect on growth rate.We can infer the growth rate directly from the change in density of our cultures.It is, however, more robust to fit a growth model to this data and calculate growth rates from the parametrised model.
One such growth model is the Gompertz model.
where A is the maximum population size as t approaches infinity, µ m is the maximum specific growth rate and λ is the lag time.The derivative of this function can be used to calculate the growth rate at time t: Fitting the model To simulate the response of colonies of bacterial to IPTG inputs the finite difference method was used to solve the equations for dT7 dt , dR dt , dGFP dt , dA(r,t) dt and dA(r,t) dt .To simulate a well in the 6 well plate (a circle with diameter 35mm).A square of width 35mm was modelled, and circular boundary conditions were imposed such that IPTG could not diffuse outside of a circle with diameter 35mm.The square was discretised by splitting it into 1521 (39 by 39) squares each with an area of 0.9mm 2 .In comparison to the toy model above this is a much coarser grained model.This was done because the parametrisation procedure requires many model simulations and a model of greater size would've been prohibitive in terms of simulation time.The decision to split the area into 39 slices on each axis also means that a distance of 4.5mm (the spacing of the Opentron positions) is almost exactly equal to 5 grid points, meaning we can place inputs and receivers in the correct place despite using a more coarse grained model.The configuration of IPTG and receivers used in the characterisation experiments (Figure S5a and the concentration of N , T 7, R, GFP and A was solved for each square using the Runga-Kutta finite difference method by calling the solve˙ivp method from Scipy [13].The output of any given parametrisation of the model can be compared with characterisation data using the sum square error of the predicted GFP levels with the observed florescence.A particle swarm algorithm [14] was used which explored the space of parameters to minimise the sum square error.This was repeated for both the the highpass and bandpass receivers.For the highpass the following parameters were fitted: D A , α T , β T K IT , n IT , K lacT , T 7 0 , α G , β G , n A , K A , G s .For the bandpass the following parameters were fitted: The results of the model fit for the highpass and bandpass receivers are shown in Figures S11 and S12 respectively.We found that at high concentrations of IPTG the output of the highpass receiver

Supplementary Figure 3 :
Dose response curves of IPTG inducible highpass and bandpass circuits in liquid and solid culture.Solid culture data is shown at 16 hours.Supplementary Table4: Receiver activation function requirements for three-input digital function computation.The second column shows the number of functions not computable.The last two columns give the number of functions with 1 and 2 required receivers (HP: highpass, LP: lowpass, BP: bandpass, BS: bandstop)

Supplementary Figure 4 :
Field experiments exploring the effects of IPTG concentration and distance between inputs.A) Decreasing IPTG concentration from right to left starting at 60 mM with a two-fold reduction each step.B) Increasing the distance between two droplets of IPTG, starting at a distance of 4.5 mm and increasing by 4.5 mm each step.Images shown here were processed as described in the Experimental Methods section.Calculated logic functions at each position are shown below.

Supplementary Figure 5 :Supplementary Figure 6 :Supplementary Figure 7 :Supplementary Figure 9 :
(a) The layout of the characterisation experiment.The inducer is pipetted in the centre of the well (blue) and eight receivers are pipetted at different distances (green).(b) Response of bandpass receiver colonies, at different distances from the IPTG input, over time.Each panel shows a different mM concentration of IPTG.Each line passes through the mean of three replicates measured at a given timepoint.Characterisation of the 3OC6-HSL sensitive receiver strains.A 1 µL droplet of 3OC6-HSL at the given concentration is dispensed as the input.Points and error bars show the mean and standard error of three replicates at 20 hours.Lines show hierarchically fitted Hill functions.Characterisation of the 3OC6-HSL producing sender strains.A 1 µL droplet of sender culture was placed at the centre of each well.The agar was impregnated with the given concentration of the respective inducer.Each combination of sender and receiver was characterised independently i.e. the top left panel shows the arabinose sender with highpass receiver.Points and error bars show the mean and standard error of three replicates at 20 hours.Lines show hierarchically fitted Hill functions.Note that at high arabinose concentrations the model does not fit the data well, as the high concentrations led to growth defects in the sender colonies.Fresnel Lens Sample LED light source (blue = 450 nm, red = 632 nm) loopbio imaging platform for visualisation of growth and fluorescence of bacterial colonies ρ inc ρ trans ρ inc ρ trans + ρ em Supplementary Figure 10: Light transmission through and emission from an illuminated sample.
1.The first non-zero state in the current sequence (first state) is combined with the item after it (second state) by concatenating both input states and sorting alphabetically.If the combined state already exists in the current sequence, increment the index of the second state by one

1 :
input: single input list 2: global queue ▷ queue order contains all the list of orders 3: queue.put([])▷ start with empty sequence 4: while True do queue.put(list of order+[item]) ▷ Add one of the next possible state to the end of the list, and put the list to the end of the queue order 14:return queue ▷ all possible orders of states Using the Algorithm 3, we can determine the valid input state orderings for a given number of inputs (Supplementary

Table 1 :
Number of possible input state orders that meet the constraints of our paradigm

Table 3 :
: Number of possible logic gates with a single receiver.Due to the constraints of signal concentration, it is impossible to produce all logic gates with 1 receiver for more than 2 inputs.We investigate whether we can apply logic optimisation to produce complex logic from the OR of simpler logic functions.To obtain the possible two receiver logic gates an OR operation was performed on each single receiver gate with every other single receiver gate.To obtain the three receiver gates each OR operation was done between each single receiver gate and each two receiver gate and so on.The results are shown in Supplementary Table3.Number of logic gates from each number of outputs